Document Type Definition (DTD)


A Document Type Definition (DTD) describes valid syntax for a class of XML documents. The DTD tells an XML parser (i.e., processor) which tags belong to a document in that class, and in what combination and order they should appear. Using the DTD, the parser checks the document for validity.

A DTD is required for an XML document to be valid, but it is not required for the document to be well-formed. When a document lacks a DTD, the XML parser cannot verify that the data structure is valid, but it can still parse and interpret the data as long as the document is well-formed.
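As a minimal sketch of this point, the Python snippet below parses a small e-mail document that has no DTD at all. The document content and the addresses are illustrative assumptions, and the parser used (the standard library's xml.etree.ElementTree) is non-validating: it only checks that the document is well-formed.

```python
import xml.etree.ElementTree as ET

# A well-formed document with no DTD. Element names and values are
# illustrative only.
NO_DTD_DOC = """<?xml version="1.0"?>
<email>
  <head>
    <from><address>alice@example.com</address></from>
    <to>bob@example.com</to>
    <subject>Hello</subject>
  </head>
  <body>Hi Bob</body>
</email>"""

# Parsing succeeds because the document is well-formed, even though
# nothing has checked it against a DTD.
root = ET.fromstring(NO_DTD_DOC)
child_tags = [child.tag for child in root]
subject = root.findtext("head/subject")

print(root.tag, child_tags, subject)
```

The parser can read the data, but nothing here guarantees that the elements appear in a permitted combination or order; that guarantee is what a DTD and a validating parser add.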

A DTD can be embedded inside an XML document or referenced and accessed from an external file. Therefore, a single DTD may apply to one document or many. The DTD, if present, must appear before the root element; only the XML declaration, processing instructions and comments may precede it. The DTD names the root element of the document and may contain additional declarations. Every XML document must have a single root element that contains the entire content of the document.
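The two placements can be sketched as follows. The "memo" document type, its element names and the file name memo.dtd are hypothetical, chosen only for illustration. The snippet uses Python's xml.dom.minidom, which reads the DOCTYPE declaration but does not validate and does not fetch the external file.

```python
from xml.dom import minidom

# DTD embedded in the document (internal subset). Names are hypothetical.
INTERNAL_DTD_DOC = """<?xml version="1.0"?>
<!DOCTYPE memo [
  <!ELEMENT memo (title, text)>
  <!ELEMENT title (#PCDATA)>
  <!ELEMENT text (#PCDATA)>
]>
<memo><title>Hello</title><text>Hi Bob</text></memo>"""

# DTD kept in an external file (hypothetical memo.dtd) and only referenced.
EXTERNAL_DTD_DOC = """<?xml version="1.0"?>
<!DOCTYPE memo SYSTEM "memo.dtd">
<memo><title>Hello</title><text>Hi Bob</text></memo>"""

internal = minidom.parseString(INTERNAL_DTD_DOC)
external = minidom.parseString(EXTERNAL_DTD_DOC)

# In both cases the DOCTYPE names the root element of the document.
print(internal.doctype.name, internal.documentElement.tagName)
print(external.doctype.name, external.doctype.systemId)
```

Keeping the declarations in an external file is what lets one DTD serve many documents: each document carries only the short DOCTYPE line that points to it.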

Fragment of a DTD for e-mail

<!ELEMENT email (head, body)>
<!ELEMENT head (from, to+, cc*, subject)>
<!ELEMENT from (name?, address)>
<!ELEMENT name (#PCDATA)>

This DTD says that each e-mail shall have two parts: a head and a body. The head part shall have a 'from' field, one or more 'to' fields, zero or more 'cc' fields and a 'subject' field, in that order. The 'from' field has an optional 'name' and an 'address'. A 'name' field is just a text string.

(NOTE: the #PCDATA keyword declares that the text string for 'name' is parsed character data; that is, character data that the parser scans for markup, so entity references in it are expanded and characters such as < and & must be escaped. XML processors assume that element content in an XML file is parsed character data by default. The exception to this is attribute data, which is generally character data. The CDATA keyword is used in attribute declarations to indicate that an attribute value is a plain string of character data that cannot contain element markup.)
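To make the combination and order rules concrete, here is a rough sketch that hand-translates the content models of the fragment into regular expressions and checks a document's child elements against them. It is an illustration only, not how a validating parser is implemented; the helper name check_content_models and both sample documents are assumptions introduced for this example.

```python
import re
import xml.etree.ElementTree as ET

# Content models from the DTD fragment, rewritten by hand as regular
# expressions over a string of child tags, each followed by a comma.
CONTENT_MODELS = {
    "email": re.compile(r"head,body,"),
    "head": re.compile(r"from,(to,)+(cc,)*subject,"),
    "from": re.compile(r"(name,)?address,"),
    "name": re.compile(r""),  # (#PCDATA): text only, no child elements
}

def check_content_models(root):
    """Return the tags of elements whose children break their content model."""
    problems = []
    for element in root.iter():
        model = CONTENT_MODELS.get(element.tag)
        if model is None:
            continue  # not declared in the fragment, so not checked here
        sequence = "".join(child.tag + "," for child in element)
        if not model.fullmatch(sequence):
            problems.append(element.tag)
    return problems

VALID = """<email><head>
  <from><name>Alice</name><address>alice@example.com</address></from>
  <to>bob@example.com</to><to>carol@example.com</to>
  <subject>Hello</subject>
</head><body>Hi</body></email>"""

# Same document without any 'to' field, which the DTD requires (to+).
INVALID = """<email><head>
  <from><address>alice@example.com</address></from>
  <subject>Hello</subject>
</head><body>Hi</body></email>"""

valid_problems = check_content_models(ET.fromstring(VALID))
invalid_problems = check_content_models(ET.fromstring(INVALID))
print(valid_problems, invalid_problems)
```

The occurrence indicators map directly onto regular-expression operators: ? for optional, + for one or more, * for zero or more, and the comma for a fixed sequence.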

Copyright 2000 Extensibility, Inc.

Suite 250, 200 Franklin Street, Chapel Hill, North Carolina 27516